Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update from origin 2020-06-18 #4

Merged
merged 35 commits into from
Jun 18, 2020
Merged

Conversation

richkadel
Copy link
Owner

No description provided.

lenary and others added 30 commits May 30, 2020 18:24
We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In #61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in #61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

This commit does ensure that the standard library is built with unwind
tables, so that users do not need to rebuild the standard library in
order to get a backtrace that includes standard library calls (which is
the original reason for forcing frame pointers).
* Check for overflow when calculating the slice start & end position.
* Align the pointer obtained from the allocator, ensuring that it
  satisfies user requested alignment (the allocator is only asked for
  layout compatible with u8 slice).
* Remove an incorrect assertion from DroplessArena::align.
Return a pointer from `alloc_raw` instead of a slice. There is no
practical use for slice as a return type and changing it to a pointer
avoids forming references to an uninitialized memory.
When we're running with dry_run enabled (i.e. all builds do this initially), we're
guaranteed to save of a toolstate of TestFail for tools that aren't tested. In practice,
we do test tools as well, so for those tools we would initially record them as being
TestPass, and then later on re-record the correct state after actually testing them.
However, this would not work well if the build failed for whatever reason (e.g. panicking
in bootstrap, or as was the case in 73097, clippy failing to test successfully), we would
just go on believing that things passed when they in practice did not.

This commit also adjusts saving toolstate to never record clippy explicitly (otherwise, it
would be recorded when building it); eventually that'll likely move to other tools as well
but not yet. This is deemed simpler than checking everywhere we generically save
toolstate.

We also move clippy out of the "toolstate" no-fail-fast build into a separate x.py
invocation; this should no longer be technically required but provides the nice state of
letting us check toolstate for all tools and only then check clippy (giving full results
on every build).
store `ObligationCause` on the heap

Stores `ObligationCause` on the heap using an `Rc`.

This PR trades off some transient memory allocations to reduce the size of–and thus the number of instructions required to memcpy–a few widely used data structures in trait solving.
Avoid prematurely recording toolstates

When we're running with dry_run enabled (i.e. all builds do this initially), we're
guaranteed to save of a toolstate of TestFail for tools that aren't tested. In practice,
we do test tools as well, so for those tools we would initially record them as being
TestPass, and then later on re-record the correct state after actually testing them.
However, this would not work well if the build failed for whatever reason (e.g. panicking
in bootstrap, or as was the case in #73097, clippy failing to test successfully), we would
just go on believing that things passed when they in practice did not.

This commit also adjusts saving toolstate to never record clippy explicitly (otherwise, it
would be recorded when building it); eventually that'll likely move to other tools as well
but not yet. This is deemed simpler than checking everywhere we generically save
toolstate.

We also move clippy out of the "toolstate" no-fail-fast build into a separate x.py
invocation; this should no longer be technically required but provides the nice state of
letting us check toolstate for all tools and only then check clippy (giving full results
on every build).

r? @oli-obk

Supercedes #73275, also fixes #73274
Apparently it got changed.
Check for overflow in DroplessArena and align returned memory

* Check for overflow when calculating the slice start & end position.
* Align the pointer obtained from the allocator, ensuring that it
  satisfies user requested alignment (the allocator is only asked for
  layout compatible with u8 slice).
* Remove an incorrect assertion from DroplessArena::align.
* Avoid forming references to an uninitialized memory in DroplessArena.

Helps with #73007, #72624.
Don't run generator transform when there's a TyErr

Not sure if this might cause any problems later on, but we shouldn't be hitting codegen or const eval for the produced MIR anyways, so it should be fine.

cc #72685 (comment)
…kinnison

Re-order correctly the sections in the sidebar

Before that, "trait implementations" and "implementors" titles in the sidebar were before "methods" for example. Which wasn't logical considering that the two sections come after in the "content".

r? @kinnison
… r=Mark-Simulacrum

Add more info to `x.py build --help` on default value for `-j JOBS`.
Use `Ipv4Addr::from<[u8; 4]>` when possible

Resolve this comment: #73331 (comment)
Fix forge-platform-support URL

Apparently it got changed.
bors added 5 commits June 16, 2020 14:58
Rollup of 8 pull requests

Successful merges:

 - #73237 (Check for overflow in DroplessArena and align returned memory)
 - #73339 (Don't run generator transform when there's a TyErr)
 - #73372 (Re-order correctly the sections in the sidebar)
 - #73373 (Use track caller for bug! macro)
 - #73380 (Add more info to `x.py build --help` on default value for `-j JOBS`.)
 - #73381 (Fix typo in docs of std::mem)
 - #73389 (Use `Ipv4Addr::from<[u8; 4]>` when possible)
 - #73400 (Fix forge-platform-support URL)

Failed merges:

r? @ghost
Update LLVM submodule

Only includes one commit:
- [D80759](https://reviews.llvm.org/D80759): Fix FastISel dropping srcloc metadata from InlineAsm

Fixes #40555
…uppe,Mark-Simulacrum

[RISC-V] Do not force frame pointers

We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In #61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in #61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

---

I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere.

There are more details on the code size savings seen in Tock here: tock/tock#1660
Fix link error with #[thread_local] introduced by #71192

r? @oli-obk
linker: Never pass `-no-pie` to non-gnu linkers

Fixes #73370
@richkadel richkadel merged commit 7ef9eb3 into richkadel:master Jun 18, 2020
richkadel added a commit that referenced this pull request Oct 5, 2020
This is a combination of 18 commits.

Commit #2:

Additional examples and some small improvements.

Commit #3:

fixed mir-opt non-mir extensions and spanview title elements

Corrected a fairly recent assumption in runtest.rs that all MIR dump
files end in .mir. (It was appending .mir to the graphviz .dot and
spanview .html file names when generating blessed output files. That
also left outdated files in the baseline alongside the files with the
incorrect names, which I've now removed.)

Updated spanview HTML title elements to match their content, replacing a
hardcoded and incorrect name that was left in accidentally when
originally submitted.

Commit #4:

added more test examples

also improved Makefiles with support for non-zero exit status and to
force validation of tests unless a specific test overrides it with a
specific comment.

Commit #5:

Fixed rare issues after testing on real-world crate

Commit #6:

Addressed PR feedback, and removed temporary -Zexperimental-coverage

-Zinstrument-coverage once again supports the latest capabilities of
LLVM instrprof coverage instrumentation.

Also fixed a bug in spanview.

Commit #7:

Fix closure handling, add tests for closures and inner items

And cleaned up other tests for consistency, and to make it more clear
where spans start/end by breaking up lines.

Commit #8:

renamed "typical" test results "expected"

Now that the `llvm-cov show` tests are improved to normally expect
matching actuals, and to allow individual tests to override that
expectation.

Commit #9:

test coverage of inline generic struct function

Commit #10:

Addressed review feedback

* Removed unnecessary Unreachable filter.
* Replaced a match wildcard with remining variants.
* Added more comments to help clarify the role of successors() in the
CFG traversal

Commit #11:

refactoring based on feedback

* refactored `fn coverage_spans()`.
* changed the way I expand an empty coverage span to improve performance
* fixed a typo that I had accidently left in, in visit.rs

Commit #12:

Optimized use of SourceMap and SourceFile

Commit #13:

Fixed a regression, and synched with upstream

Some generated test file names changed due to some new change upstream.

Commit #14:

Stripping out crate disambiguators from demangled names

These can vary depending on the test platform.

Commit #15:

Ignore llvm-cov show diff on test with generics, expand IO error message

Tests with generics produce llvm-cov show results with demangled names
that can include an unstable "crate disambiguator" (hex value). The
value changes when run in the Rust CI Windows environment. I added a sed
filter to strip them out (in a prior commit), but sed also appears to
fail in the same environment. Until I can figure out a workaround, I'm
just going to ignore this specific test result. I added a FIXME to
follow up later, but it's not that critical.

I also saw an error with Windows GNU, but the IO error did not
specify a path for the directory or file that triggered the error. I
updated the error messages to provide more info for next, time but also
noticed some other tests with similar steps did not fail. Looks
spurious.

Commit #16:

Modify rust-demangler to strip disambiguators by default

Commit #17:

Remove std::process::exit from coverage tests

Due to Issue rust-lang#77553, programs that call std::process::exit() do not
generate coverage results on Windows MSVC.

Commit #18:

fix: test file paths exceeding Windows max path len
richkadel pushed a commit that referenced this pull request Nov 29, 2020
Don't run `resolve_vars_if_possible` in `normalize_erasing_regions`

Neither `@eddyb` nor I could figure out what this was for. I changed it to `assert_eq!(normalized_value, infcx.resolve_vars_if_possible(&normalized_value));` and it passed the UI test suite.

<details><summary>

Outdated, I figured out the issue - `needs_infer()` needs to come _after_ erasing the lifetimes

</summary>

Strangely, if I change it to `assert!(!normalized_value.needs_infer())` it panics almost immediately:

```
query stack during panic:
#0 [normalize_generic_arg_after_erasing_regions] normalizing `<str::IsWhitespace as str::pattern::Pattern>::Searcher`
#1 [needs_drop_raw] computing whether `str::iter::Split<str::IsWhitespace>` needs drop
#2 [mir_built] building MIR for `str::<impl str>::split_whitespace`
#3 [unsafety_check_result] unsafety-checking `str::<impl str>::split_whitespace`
#4 [mir_const] processing MIR for `str::<impl str>::split_whitespace`
#5 [mir_promoted] processing `str::<impl str>::split_whitespace`
#6 [mir_borrowck] borrow-checking `str::<impl str>::split_whitespace`
#7 [analysis] running analysis passes on this crate
end of query stack
```

I'm not entirely sure what's going on - maybe the two disagree?

</details>

For context, this came up while reviewing rust-lang#77467 (cc `@lcnr).`

Possibly this needs a crater run?

r? `@nikomatsakis`
cc `@matthewjasper`
richkadel pushed a commit that referenced this pull request Mar 19, 2021
HWAddressSanitizer support

#  Motivation
Compared to regular ASan, HWASan has a [smaller overhead](https://source.android.com/devices/tech/debug/hwasan). The difference in practice is that HWASan'ed code is more usable, e.g. Android device compiled with HWASan can be used as a daily driver.

# Example
```
fn main() {
    let xs = vec![0, 1, 2, 3];
    let _y = unsafe { *xs.as_ptr().offset(4) };
}
```
```
==223==ERROR: HWAddressSanitizer: tag-mismatch on address 0xefdeffff0050 at pc 0xaaaad00b3468
READ of size 4 at 0xefdeffff0050 tags: e5/00 (ptr/mem) in thread T0
    #0 0xaaaad00b3464  (/root/main+0x53464)
    #1 0xaaaad00b39b4  (/root/main+0x539b4)
    #2 0xaaaad00b3dd0  (/root/main+0x53dd0)
    #3 0xaaaad00b61dc  (/root/main+0x561dc)
    #4 0xaaaad00c0574  (/root/main+0x60574)
    #5 0xaaaad00b6290  (/root/main+0x56290)
    #6 0xaaaad00b6170  (/root/main+0x56170)
    #7 0xaaaad00b3578  (/root/main+0x53578)
    #8 0xffff81345e70  (/lib64/libc.so.6+0x20e70)
    #9 0xaaaad0096310  (/root/main+0x36310)

[0xefdeffff0040,0xefdeffff0060) is a small allocated heap chunk; size: 32 offset: 16
0xefdeffff0050 is located 0 bytes to the right of 16-byte region [0xefdeffff0040,0xefdeffff0050)
allocated here:
    #0 0xaaaad009bcdc  (/root/main+0x3bcdc)
    #1 0xaaaad00b1eb0  (/root/main+0x51eb0)
    #2 0xaaaad00b20d4  (/root/main+0x520d4)
    #3 0xaaaad00b2800  (/root/main+0x52800)
    #4 0xaaaad00b1cf4  (/root/main+0x51cf4)
    #5 0xaaaad00b33d4  (/root/main+0x533d4)
    #6 0xaaaad00b39b4  (/root/main+0x539b4)
    #7 0xaaaad00b61dc  (/root/main+0x561dc)
    #8 0xaaaad00b3578  (/root/main+0x53578)
    #9 0xaaaad0096310  (/root/main+0x36310)

Thread: T0 0xeffe00002000 stack: [0xffffc0590000,0xffffc0d90000) sz: 8388608 tls: [0xffff81521020,0xffff815217d0)
Memory tags around the buggy address (one tag corresponds to 16 bytes):
  0xfefcefffef80: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffef90: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefa0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefb0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefc0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefd0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefe0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffeff0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
=>0xfefceffff000: a2  a2  05  00  e5 [00] 00  00  00  00  00  00  00  00  00  00
  0xfefceffff010: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff020: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff030: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff040: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff050: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff060: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff070: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff080: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
Tags for short granules around the buggy address (one tag corresponds to 16 bytes):
  0xfefcefffeff0: ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..
=>0xfefceffff000: ..  ..  c5  ..  .. [..] ..  ..  ..  ..  ..  ..  ..  ..  ..  ..
  0xfefceffff010: ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..
See https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#short-granules for a description of short granule tags
Registers where the failure occurred (pc 0xaaaad00b3468):
    x0  e500efdeffff0050  x1  0000000000000004  x2  0000ffffc0d8f5a0  x3  0200efff00000000
    x4  0000ffffc0d8f4c0  x5  000000000000004f  x6  00000ffffc0d8f36  x7  0000efff00000000
    x8  e500efdeffff0050  x9  0200efff00000000  x10 0000000000000000  x11 0200efff00000000
    x12 0200effe000006b0  x13 0200effe000006b0  x14 0000000000000008  x15 00000000c00000cf
    x16 0000aaaad00a0afc  x17 0000000000000003  x18 0000000000000001  x19 0000ffffc0d8f718
    x20 ba00ffffc0d8f7a0  x21 0000aaaad00962e0  x22 0000000000000000  x23 0000000000000000
    x24 0000000000000000  x25 0000000000000000  x26 0000000000000000  x27 0000000000000000
    x28 0000000000000000  x29 0000ffffc0d8f650  x30 0000aaaad00b3468
```

# Comments/Caveats
* HWASan is only supported on arm64.
* I'm not sure if I should add a feature gate or piggyback on the existing one for sanitizers.
* HWASan requires `-C target-feature=+tagged-globals`. That flag should probably be set transparently to the user. Not sure how to go about that.

# TODO
* Need more tests.
* Update documentation.
* Fix symbolization.
* Integrate with CI
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.